9 research outputs found
LLL. Algoritme de reducció de bases de xarxes
Final Degree Projects in Mathematics, Faculty of Mathematics, Universitat de Barcelona, Year: 2017, Advisor: Artur Travesa i Grau. The LLL algorithm is a powerful tool for reducing lattice bases in polynomial time, introduced by Arjen Lenstra, Hendrik Lenstra and László Lovász in 1982. We study its implementation and prove its polynomial running time. Finally, we show its use in factoring polynomials with rational coefficients and give some computational examples.
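As a rough illustration of the lattice basis reduction described in this abstract, the following is a minimal sketch of the textbook LLL procedure with parameter δ = 3/4. It is not the thesis's implementation: the function names (`dot`, `gram_schmidt`, `lll_reduce`) are hypothetical, the input is assumed to be a linearly independent integer basis given as rows, and the Gram-Schmidt data is recomputed from scratch after every change for clarity rather than speed.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return the Gram-Schmidt vectors b*_i and the coefficients mu[i][j]."""
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [vi - mu[i][j] * oj for vi, oj in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll_reduce(basis, delta=Fraction(3, 4)):
    """LLL-reduce an integer basis (rows assumed linearly independent)."""
    basis = [[Fraction(x) for x in b] for b in basis]
    n = len(basis)
    ortho, mu = gram_schmidt(basis)
    k = 1
    while k < n:
        # Size reduction: make |mu[k][j]| <= 1/2 for every j < k.
        for j in range(k - 1, -1, -1):
            q = round(mu[k][j])
            if q != 0:
                basis[k] = [a - q * b for a, b in zip(basis[k], basis[j])]
                ortho, mu = gram_schmidt(basis)
        # Lovász condition between consecutive Gram-Schmidt vectors.
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:
            # Swap the two vectors and step back.
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            ortho, mu = gram_schmidt(basis)
            k = max(k - 1, 1)
    # Entries stay integral because the input was assumed integral.
    return [[int(x) for x in b] for b in basis]


if __name__ == "__main__":
    print(lll_reduce([[1, 1, 1], [-1, 0, 2], [3, 5, 6]]))
```

Exact rational arithmetic via `fractions.Fraction` avoids floating-point error in the Lovász test, at the cost of speed; practical implementations typically update the coefficients incrementally instead of recomputing them.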
Us vs. Them: A Dataset of Populist Attitudes, News Bias and Emotions
Computational modelling of political discourse tasks has become an
increasingly important area of research in natural language processing.
Populist rhetoric has risen across the political sphere in recent years;
however, computational approaches to it have been scarce due to its complex
nature. In this paper, we present the new Us vs. Them dataset,
consisting of 6861 Reddit comments annotated for populist attitudes and the
first large-scale computational models of this phenomenon. We investigate the
relationship between populist mindsets and social groups, as well as a range of
emotions typically associated with these. We set a baseline for two tasks
related to populist attitudes and present a set of multi-task learning models
that leverage and demonstrate the importance of emotion and group
identification as auxiliary tasks.
Comment: Camera-ready version in EACL 202
Cross-lingual AMR Aligner: Paying Attention to Cross-Attention
This paper introduces a novel aligner for Abstract Meaning Representation
(AMR) graphs that can scale cross-lingually, and is thus capable of aligning
units and spans in sentences of different languages. Our approach leverages
modern Transformer-based parsers, which inherently encode alignment information
in their cross-attention weights, allowing us to extract this information
during parsing. This eliminates the need for English-specific rules or the
Expectation Maximization (EM) algorithm that have been used in previous
approaches. In addition, we propose a guided supervised method using alignment
to further enhance the performance of our aligner. We achieve state-of-the-art
results in the benchmarks for AMR alignment and demonstrate our aligner's
ability to obtain them across multiple languages. Our code will be available at
https://www.github.com/Babelscape/AMR-alignment
Comment: ACL 2023. Please cite authors correctly using both last names
("Martínez Lorenzo", "Huguet Cabot")
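To make the idea of reading alignments off cross-attention weights concrete, here is a deliberately simplified sketch. It is not the paper's method: it assumes a single, already averaged attention matrix with one row per generated graph token and one column per sentence token, and it aligns each graph token to the sentence token with the largest weight. The function name and the toy data are hypothetical.

```python
def align_from_cross_attention(attention, source_tokens, target_tokens):
    """Align each target token to the source token it attends to most.

    attention[t][s] is assumed to be the weight that target token t
    places on source token s (a hypothetical, pre-averaged matrix).
    """
    alignments = []
    for t_idx, row in enumerate(attention):
        # Index of the largest weight in this row (first one on ties).
        s_idx = max(range(len(row)), key=row.__getitem__)
        alignments.append((target_tokens[t_idx], source_tokens[s_idx]))
    return alignments


if __name__ == "__main__":
    sentence = ["the", "boy", "wants"]
    graph_tokens = ["want-01", "boy"]
    toy_attention = [
        [0.10, 0.20, 0.70],
        [0.05, 0.90, 0.05],
    ]
    print(align_from_cross_attention(toy_attention, sentence, graph_tokens))
```

A real system would have to choose which layers and heads to read and how to handle spans; the argmax here only illustrates the principle.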
Incorporating Graph Information in Transformer-based AMR Parsing
Abstract Meaning Representation (AMR) is a Semantic Parsing formalism that
aims at providing a semantic graph abstraction representing a given text.
Current approaches are based on autoregressive language models such as BART or
T5, fine-tuned through Teacher Forcing to obtain a linearized version of the
AMR graph from a sentence. In this paper, we present LeakDistill, a model and
method that explores a modification to the Transformer architecture, using
structural adapters to explicitly incorporate graph information into the
learned representations and improve AMR parsing performance. Our experiments
show how, by employing word-to-node alignment to embed graph structural
information into the encoder at training time, we can obtain state-of-the-art
AMR parsing through self-knowledge distillation, even without the use of
additional data. We release the code at
http://www.github.com/sapienzanlp/LeakDistill
Comment: ACL 2023. Please cite authors correctly using both last names
("Martínez Lorenzo", "Huguet Cabot")
RED: a Filtered and Multilingual Relation Extraction Dataset
Relation Extraction (RE) is a task that identifies relationships between
entities in a text, enabling the acquisition of relational facts and bridging
the gap between natural language and structured knowledge. However, current RE
models often rely on small datasets with low coverage of relation types,
particularly when working with languages other than English. In this paper, we
address the above issue and provide two new resources that enable the training
and evaluation of multilingual RE systems. First, we present SRED,
an automatically annotated dataset covering 18 languages, 400 relation types,
13 entity types, totaling more than 40 million triplet instances. Second, we
propose RED, a smaller, human-revised dataset for seven languages
that allows for the evaluation of multilingual RE systems. To demonstrate the
utility of these novel datasets, we experiment with the first end-to-end
multilingual RE model, mREBEL, that extracts triplets, including entity types,
in multiple languages. We release our resources and model checkpoints at
https://www.github.com/babelscape/rebel
Comment: ACL 2023. Please cite authors correctly using both last names
("Huguet Cabot", "Ngonga Ngomo")
REBEL: Relation Extraction By End-to-end Language generation
Extracting relation triplets from raw text is a crucial task in Information Extraction, enabling multiple applications such as populating or validating knowledge bases, fact-checking, and other downstream tasks. However, it usually involves multiple-step pipelines that propagate errors or are limited to a small number of relation types. To overcome these issues, we propose the use of autoregressive seq2seq models. Such models have previously been shown to perform well not only in language generation, but also in NLU tasks such as Entity Linking, thanks to their framing as seq2seq tasks. In this paper, we show how Relation Extraction can be simplified by expressing triplets as a sequence of text and we present REBEL, a seq2seq model based on BART that performs end-to-end relation extraction for more than 200 different relation types. We show our model’s flexibility by fine-tuning it on an array of Relation Extraction and Relation Classification benchmarks, with it attaining state-of-the-art performance in most of them.
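The idea of expressing triplets as a sequence of text can be sketched as follows. This is an illustrative linearization only: the marker strings `<triplet>`, `<subj>` and `<obj>`, the grouping by head entity, and the function names are assumptions made for this example, so the REBEL repository should be consulted for the format the model actually uses. The example triplets are toy data.

```python
# Marker tokens are assumptions for this sketch.
TRIPLET, SUBJ, OBJ = "<triplet>", "<subj>", "<obj>"


def linearize(triplets):
    """Turn (head, relation, tail) triplets into one target string."""
    grouped = {}
    for head, relation, tail in triplets:
        # Group by head so each head entity is written only once.
        grouped.setdefault(head, []).append((tail, relation))
    parts = []
    for head, pairs in grouped.items():
        parts.append(f"{TRIPLET} {head}")
        for tail, relation in pairs:
            parts.append(f"{SUBJ} {tail} {OBJ} {relation}")
    return " ".join(parts)


def parse(text):
    """Recover (head, relation, tail) triplets from a linearized string."""
    triplets = []
    for chunk in text.split(TRIPLET)[1:]:
        head, *rest = chunk.split(SUBJ)
        for pair in rest:
            tail, relation = pair.split(OBJ)
            triplets.append((head.strip(), relation.strip(), tail.strip()))
    return triplets


if __name__ == "__main__":
    toy = [
        ("Barcelona", "country", "Spain"),
        ("Barcelona", "located in", "Catalonia"),
        ("Spain", "capital", "Madrid"),
    ]
    encoded = linearize(toy)
    print(encoded)
    print(parse(encoded))
```

With a format like this, a seq2seq model only has to generate a string, and the structured triplets are recovered by a deterministic parse of its output.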